P15.2 Fortgeschrittenes Praxisprojekt
Dr. Nicolas Ferry - Bavarian National Forest Park / Daniel Schlichting - StabLab
31 Jan 2025
Model FCM levels as a function of spatial and temporal distance to hunting activities, amongst other covariates
Expectations:
Contains information on 809 faecal samples, including:
Samples were taken at irregular time intervals from 2020 to 2022.
Contains location and time of \(\geq\) 700 hunting events from 2020 to 2022.
The deer's location at the time of a hunting event is approximated by linear interpolation:
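A minimal sketch of such an interpolation, assuming the deer's position is known at two recorded fixes that bracket the hunting event and that coordinates are projected (so straight-line interpolation is meaningful). The function name, the tuple layout and the example values are hypothetical, not taken from the project.

```python
from datetime import datetime


def interpolate_position(fix_before, fix_after, event_time):
    """Linearly interpolate a position at event_time.

    Each fix is a (timestamp, x, y) tuple. The layout is an assumption
    made for this sketch.
    """
    t0, x0, y0 = fix_before
    t1, x1, y1 = fix_after
    if not t0 <= event_time <= t1:
        raise ValueError("event_time must lie between the two fixes")
    span = (t1 - t0).total_seconds()
    if span == 0:
        # Both fixes share a timestamp, nothing to interpolate.
        return x0, y0
    # Fraction of the interval that has elapsed at event_time.
    w = (event_time - t0).total_seconds() / span
    return x0 + w * (x1 - x0), y0 + w * (y1 - y0)


# Hypothetical fixes one hour apart; the hunt starts 15 minutes after the first.
before = (datetime(2021, 5, 1, 10, 0), 1000.0, 2000.0)
after = (datetime(2021, 5, 1, 11, 0), 1400.0, 2200.0)
position = interpolate_position(before, after, datetime(2021, 5, 1, 10, 15))
```

The interpolated point moves a quarter of the way from the first fix to the second, since a quarter of the hour has elapsed.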
A hunting event is considered relevant to an FCM sample if
Among the relevant hunting events, the most relevant one is defined by one of the three proximity criteria:
We define the scoring function as follows:
\[ S(d, t) \propto \begin{cases} \dfrac{1}{d^2} \, f_{\mathcal{N}}(t \mid \mu, \sigma^2), & t \leq \mu \\ \dfrac{1}{d^2} \, f_{\mathrm{Laplace}}(t \mid \mu, b), & t > \mu \end{cases} \] where \(f_{\mathcal{N}}\) and \(f_{\mathrm{Laplace}}\) denote the normal and Laplace densities, and:
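\[ \begin{aligned} d & \text{: Distance } \\ t & \text{: Time Difference } \\ \mu & \text{: GRT target = 19 hours } \\ \sigma & \text{: Standard deviation of the normal part } \\ b & \text{: Scale of the Laplace part } \end{aligned} \]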
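The scoring function can be sketched in a few lines. Only the GRT target of 19 hours comes from the definition above; the values of `SIGMA` and `B` are hypothetical placeholders, and the proportionality constant is taken as 1.

```python
import math

MU = 19.0    # GRT target in hours (from the definition above)
SIGMA = 4.0  # hypothetical value, not given in the source
B = 8.0      # hypothetical value, not given in the source


def time_density(t, mu=MU, sigma=SIGMA, b=B):
    """Normal density for t <= mu, Laplace density for t > mu."""
    if t <= mu:
        z = (t - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return math.exp(-abs(t - mu) / b) / (2.0 * b)


def score(d, t, mu=MU, sigma=SIGMA, b=B):
    """Score of a hunting event at distance d and time difference t (hours)."""
    if d <= 0:
        raise ValueError("distance must be positive")
    return time_density(t, mu, sigma, b) / d ** 2
```

With these placeholder values the two branches do not meet at the same height at \(t = \mu\); whether the original implementation rescales them to be continuous is not stated here.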
The marginal effects of distance and elapsed time since challenge on the score:
We suggest three different datasets for modelling:
| Dataset | GRT low (h) | GRT high (h) | Distance Threshold | Proximity Criterion | Deer | Observations |
|---|---|---|---|---|---|---|
| 1 | 0 | 36 | 10 | closest in time | 35 | 149 |
| 2 | 0 | 36 | 10 | nearest | 35 | 147 |
| 3 | 0 | 200 | 15 | score | 36 | 223 |
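How one row of this table could translate into a selection rule is sketched below. Several points are assumptions made for illustration: that relevance means the time difference lies within [GRT low, GRT high] and the distance does not exceed the threshold, that "closest in time" means the smallest time difference, and that candidates are dictionaries with the hypothetical keys `distance` and `time_diff`. `toy_score` is a stand-in, not the project's scoring function.

```python
import math


def toy_score(d, t):
    """Stand-in score: decays with squared distance and with |t - 19 h|."""
    return math.exp(-abs(t - 19.0) / 8.0) / d ** 2


def select_event(candidates, grt_low, grt_high, max_distance, criterion,
                 score_fn=toy_score):
    """Return the most relevant hunting event for one FCM sample, or None."""
    # Step 1: keep only events inside the time window and distance threshold.
    relevant = [
        c for c in candidates
        if grt_low <= c["time_diff"] <= grt_high and c["distance"] <= max_distance
    ]
    if not relevant:
        return None
    # Step 2: pick one event according to the proximity criterion.
    if criterion == "closest in time":
        return min(relevant, key=lambda c: c["time_diff"])
    if criterion == "nearest":
        return min(relevant, key=lambda c: c["distance"])
    if criterion == "score":
        return max(relevant, key=lambda c: score_fn(c["distance"], c["time_diff"]))
    raise ValueError(f"unknown criterion: {criterion}")


# Hypothetical candidate events for a single FCM sample.
events = [
    {"id": "A", "distance": 8.0, "time_diff": 5.0},
    {"id": "B", "distance": 3.0, "time_diff": 30.0},
    {"id": "C", "distance": 4.0, "time_diff": 20.0},
    {"id": "D", "distance": 12.0, "time_diff": 18.0},
]
chosen = {
    crit: select_event(events, 0, 36, 10, crit)["id"]
    for crit in ("closest in time", "nearest", "score")
}
```

In this toy example each criterion picks a different event, which illustrates why the three datasets differ in their number of observations and in the covariate values attached to each sample.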
For modelling, we consider the following covariates, defined for each pair of FCM sample and most relevant hunting event:
We chose two different approaches to modelling:
We do this separately for all three datasets (closest in time, nearest and score).
| Model | Objective | Evaluation Metric | Max Depth | Eta | Gamma | Subsample | Colsample Bytree | Min Child Weight | Mean RMSE | SD RMSE | Number of Observations |
|---|---|---|---|---|---|---|---|---|---|---|---|
| last | reg:squarederror | rmse | 4 | 0.1635 | 5.850 | 0.5918 | 0.9921 | 4.640 | 168.6336 | 24.40957 | 149 |
| nearest | reg:squarederror | rmse | 4 | 0.1661 | 5.893 | 0.5956 | 0.9832 | 4.747 | 151.3186 | 17.91780 | 147 |
| score | reg:squarederror | rmse | 5 | 0.1744 | 5.834 | 0.6063 | 1.0000 | 4.766 | 147.9845 | 16.50250 | 223 |
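One way a mean and standard deviation of RMSE like those in the table could be obtained is k-fold cross-validation with `xgboost.cv`. The sketch below uses the hyperparameters of the "score" row; the number of boosting rounds, the number of folds, the seed and the function name are assumptions, and the table does not say that this exact procedure was used.

```python
# Hyperparameters copied from the "score" row of the table above.
PARAMS_SCORE = {
    "objective": "reg:squarederror",
    "eval_metric": "rmse",
    "max_depth": 5,
    "eta": 0.1744,
    "gamma": 5.834,
    "subsample": 0.6063,
    "colsample_bytree": 1.0,
    "min_child_weight": 4.766,
}


def cross_validated_rmse(X, y, params=PARAMS_SCORE, num_boost_round=100,
                         nfold=5, seed=0):
    """Return (mean, sd) of the test RMSE across folds after the last round.

    num_boost_round, nfold and seed are placeholder choices for this sketch.
    """
    # Imported lazily so the parameter set can be inspected without xgboost.
    import xgboost as xgb

    dtrain = xgb.DMatrix(X, label=y)
    # With pandas installed, xgb.cv returns a DataFrame with one row per round.
    history = xgb.cv(params, dtrain, num_boost_round=num_boost_round,
                     nfold=nfold, seed=seed)
    last = history.iloc[-1]
    return last["test-rmse-mean"], last["test-rmse-std"]
```

With only 147 to 223 observations per dataset, the fold-to-fold spread is expected to be large, so the SD column matters when comparing the three models.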
Family: Gamma
Log link for interpretability
Let \(i = 1,\dots,N\) be the indices of deer and \(j = 1,\dots,n_i\) be the indices of FCM measurements for each deer
\[ \begin{aligned} \textup{FCM}_{ij} &\sim \mathcal{Ga}\left( \nu, \frac{\nu}{\mu_{ij}} \right) \\ \mu_{ij} &= \mathbb{E}(\textup{FCM}_{ij}) = \exp(\eta_{ij}) \\ \eta_{ij} &= \beta_0 + \beta_1 \textup{Pregnant}_{ij} + \beta_2 \textup{NumberOtherHunts}_{ij} \\ &\quad + f_1(\textup{TimeDiff}_{ij}) + f_2(\textup{Distance}_{ij}) \\ &\quad + f_3(\textup{SampleDelay}_{ij}) + f_4(\textup{DefecationDay}_{ij}) \\ &\quad + \gamma_{i}, \\ \gamma_i &\overset{\mathrm{iid}}{\sim} \mathcal{N}(0, \sigma_\gamma^2). \end{aligned} \]
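Reading the second argument of \(\mathcal{Ga}\) as a rate (which is what makes \(\mathbb{E}(\textup{FCM}_{ij}) = \mu_{ij}\) hold), the first two moments follow directly from the shape-rate formulas:

\[ \begin{aligned} \mathbb{E}(\textup{FCM}_{ij}) &= \frac{\nu}{\nu / \mu_{ij}} = \mu_{ij}, \\ \textup{Var}(\textup{FCM}_{ij}) &= \frac{\nu}{(\nu / \mu_{ij})^2} = \frac{\mu_{ij}^2}{\nu}. \end{aligned} \]

The standard deviation is therefore proportional to the mean, with a constant coefficient of variation \(1/\sqrt{\nu}\).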
| Dataset | Term | Estimate | Std. Error |
|---|---|---|---|
| Closest in Time | (Intercept) | 5.824 | 0.053 |
| Closest in Time | NumOtherHunts | -0.137 | 0.061 |
| Nearest | (Intercept) | 5.812 | 0.054 |
| Nearest | NumOtherHunts | -0.103 | 0.060 |
| Highest Score | (Intercept) | 5.905 | 0.081 |
| Highest Score | NumOtherHunts | -0.016 | 0.014 |
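Because of the log link, each coefficient acts multiplicatively on the expected FCM level once exponentiated. The sketch below converts the NumOtherHunts estimates above to the response scale, assuming the covariate enters as a count and using an approximate Wald interval (estimate ± 1.96 standard errors); the interval type is an assumption, not something reported in the slides.

```python
import math

# (estimate, standard error) for NumOtherHunts, copied from the estimates above.
NUM_OTHER_HUNTS = {
    "Closest in Time": (-0.137, 0.061),
    "Nearest": (-0.103, 0.060),
    "Highest Score": (-0.016, 0.014),
}


def multiplicative_effect(estimate, std_error, z=1.96):
    """Return exp(estimate) and an approximate Wald interval on the response scale."""
    return (
        math.exp(estimate),
        math.exp(estimate - z * std_error),
        math.exp(estimate + z * std_error),
    )


for dataset, (est, se) in NUM_OTHER_HUNTS.items():
    ratio, lower, upper = multiplicative_effect(est, se)
    print(f"{dataset}: factor {ratio:.3f} per additional hunt "
          f"(approx. 95% interval {lower:.3f} to {upper:.3f})")
```

A factor below 1 corresponds to a lower expected FCM level per additional other hunting event; an interval that includes 1 is compatible with no effect.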
| Category | Subcategory | Description |
|---|---|---|
| Diagnostics | QQ Plot | Residuals mostly follow expected distribution |
| Diagnostics | Residuals vs Predictor | No major pattern |
| Diagnostics | Histogram | Reasonable fit, some variance |
| Diagnostics | Observed vs Fitted | Moderate spread, some unexplained variance |
| Smooth Effects | Time & Space Effects | Weak or inconsistent |
| Smooth Effects | Sample Delay | Shows some effect |
| Linear Effects | Other Hunting Events | No significant impact |
How to minimize spatial and temporal distance at the same time?
How to use a larger part of the data?
Effect of Hunting on Red Deer